Copyright (c) 1993 by Roy S. Woll
Class "str", Version 2.2 5.16/93
You may distribute and sell any executable which results from using this code
in your applications. You may redistribute this source freely as long as you
leave all files in their original form, including the copyright notice as is.
You may NOT include any SOURCE code of this software with any program that is
sold.
I would sincerely welcome any comments/criticism/ideas you might have about
the str or the regular expression class.
Registration:
-------------
If you decide to use this product, you must register by one of the
following two methods.
Online-registration:
--------------------
You can also register strX directly on Compuserve by going to
the SHAREWARE REGISTRATION section and looking for the product
strX (Registration ID 925)
Mail
----
Register by sending $15.00 to
Roy S. Woll, 1032 Summerplace Dr., San Jose, CA 95122.
By registering you will receive an enhanced version of the class that
includes context sensitive regular expressions, and more extensive
documentation.
In addition those of you who register will receive a more powerful
version of the regular expression class that includes context-sensitive
regular expressions. For instance you will easily be able to search or
replace a specific portion (flagged by '@') of a regular expression.
regX employeeX("Pay to the order of @[A-Za-z\\s]+$");
str paycheck("Payroll\nPay to the order of Roy S. Woll\n$50,000");
str employee;
paycheck.search(employeeX, &employee);
paycheck.replace(employeeX, "a lucky person");
//
// After executing the above code, employee will contain the
// name of the person following the text "Pay to the order of ".
//
// employee = "Roy S. Woll"
// paycheck = "Payroll\nPay to the order of a lucky person\n$50,000"
//
Support:
--------
------------------------------------------
| |
| Roy S. Woll |
| 1032 Summerplace Dr. |
| San Jose, CA 95122 |
| |
| CompuServe : 76207,2541 |
| |
| Phone: (408) 778-2000 x4518 (day) |
| (408) 293-5893 (evening) |
| |
------------------------------------------
------------------------------------------------------------------------------
FILES: THE FOLLOWING FILES ARE INCLUDED.
-----------------------------------------
str.doc Documentation file for str class
str.h Interface file for str class
regX.h Interface file for regular expression class
regXimp.h Interface file used only for implementation of regX
dynstream.h Interface file for dynstream class
bcstr.h Interface file for BCstr class. BCstr is compatible
with the Borland object-based container classes. It
is derived from str.
str.cpp Implementation file for str class
regX.cpp Implementation file for regular expression class
dynstream.cpp Implementation file for dynstream class
match.cpp Regular expression compiling and searching routines
strsearch.cpp Member functions relating to search/replace
bcstr.cpp Implementation file for BCstr class
strcmp.cpp Non-ansi string routines used by str class. Add this
to your library if your system does not have these
(stricmp, strnicmp, strupr, strlwr).
grep.cpp Demo program for "str" class, supporting file
searching of regular expression matches. Supports
wildcard file specifications, case sensitivity, line
numbers, etc.
makefile This program defines how to build str.lib and
grep.exe
readme Brief overview
1 GENERAL OVERVIEW AND DESIGN GOALS
-----------------------------------
From the beginning this string class was designed to maximize usability and
efficiency. The following is a breakdown of some of the design objectives.
Composition: One of the most common operations dealing with strings is to
compose a string from other data. The user would like to have the
flexibility to naturally express how data is inserted into the composed
string. Class ostream from the iostream library has one of the most
consistent and natural ways of transferring data I have yet seen. Streams
already provide all the functionality for converting built-in and user-defined
types to a stream of char. Thus complete interoperability with class
"ostream" from the iostream library was a primary goal.
Efficiency through reference counting: Every time you pass or return an
object by value, a temporary copy is made of the original object. Temporary
object creation for string objects can be an expensive operation. To increase
the efficiency for making copies of "str" objects, reference counting should
be used. By using reference counting, temporary object creation becomes a
very cheap operation. Only a pointer is copied to create the new string,
instead of allocating a new pointer, copying the character buffer, and then
deallocating the pointer.
Efficiency through user-definable memory allocation: Allocating and
deallocating memory can be an expensive operation so it should be minimized.
This should be handled by allocating a new data buffer only if the old buffer
is too small to store the new data. The user should be given the flexibility
to define on an instance basis how much memory to allocate initially and when
the string buffer overflows.
Search, replacement, and case sensitivity: Searching and replacement of
literal or regular expressions within a string should be supported. Searching
should use the case sensitivity of the string being compared. Case
sensitivity should also extend to all the relational operators.
2 INSTALLATION AND USE:
-------------------------
2-1 STR.LIB
-----------
Type "make" to compile the source and create a library called
str.lib. If you wish to place the object files in your own
library, insert the .obj files into your library. You may also
want to place "str.h", and "regX.h" into your default include
path. Bcstr.h is provided for those who wish to use the Borland
object-based container classes to store str's.
If you are using Turbo C++ instead of Borland C++, edit the
makefile and substitute "TCC" for "BCC".
Unix, VAX/VMS, and some other systems may also need to add
strcmp.obj to the library. This module defines non-ANSI string
routines used by the str class. Add it to your library if your
system does not have these (stricmp, strnicmp, strupr, strlwr).
Do not add it to your library if your system already defines
them (e.g. Borland compilers).
2-2 GREP
--------
Type "make grep.exe" to create the executable for grep. Grep is
included as a demonstration program for the str class. It
supports searching of literal and regular expressions within
files. Wildcard file specifications, case sensitivity, line
numbers, etc. are all supported. The implementation uses only
around 1 page of code, which demonstrates how natural coding is
when using the regular expression capabilities of the string
class.
2-3 UPGRADING
-------------
If you are upgrading from version 1, then you will need to
recompile all .cpp files that use the str class. This must be
done since str.h has changed. You also will need to change
occurrences of pad or strip to use the global versions. This must
be done since the member functions pad and strip now modify their
object. See section "Whats changed in 2.00".
2-4 USING
---------
You will need to include <str.h> or "str.h" in order to use class
str. If you also wish to use regular expressions, include
<regX.h> or "regX.h". Header files "dynstream.h" and "regXimp.h"
are strictly for implementation, and as such are kept in separate
header files. You should never reference them unless you
wish to modify their implementation, or derive a new class from
them.
3 WHATS DIFFERENT ABOUT VERSION 2.0 - 2.2
-----------------------------------------
3-0 WHATS DIFFERENT ABOUT VERSION 2.2
---------------------------------
Fixed substr assignment problem. --> str = substr
Added member functions read, lowercase,
uppercase, and variations of pad and strip.
3-1 WHATS DIFFERENT ABOUT VERSION 2.1
-------------------------------------
Version 2.1 extends regular expression support to include context
sensitive regular expressions. See section on regular expressions
for more details.
3-2 WHATS CHANGED IN 2.12
------------------------------
Fix - substr assignment problem. --> str = substr
Fix - case sensitivity problem, member function index was backwards.
3-3 WHATS NEW IN 2.02 and 2.11
------------------------------
Friend operator >> for reading in strings now directly uses the
string buffer, so as to remove the 256 character limit.
Grep now supports files in other drives and directories.
Optimizations to efficiency in str::_assign which is used
by many str member functions.
Regular expression character sets can now contain octal characters.
Fix - Member function "remove" now transfers only necessary
characters. May have caused Windows application error
previously.
4 WHATS DIFFERENT ABOUT VERSION 2.0
-----------------------------------
4-1 WHATS NEW IN 2.00
---------------------
1. Searching and replacing of character strings and regular
expressions.
2. Case sensitivity now is a property of each instance of str. All
searching and comparing for the str instance automatically
reflects its case sensitivity. During comparisons between two
strings, the case sensitivity of the first argument is used.
Instances of str modify their case sensitivity through member
function setCaseSensitive(int).
{
str a=("abcd efgh");
a.setCaseSensitive(0); // a is now case insensitive
str b=("ABCD EFGH");
cout << a.search("efGH"); // "1" Found
cout << b.search("efGH"); // "0" Not found
cout << (a==b); // "1" Are equal
cout << (b==a); // "0" Not equal
}
3. Miscellaneous optimizations and fixes.
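The asymmetry in the example above, where (a==b) prints 1 while (b==a) prints 0, follows from the rule that the first operand decides the case sensitivity of a comparison. The following is only a sketch of that rule in standard C++, not the str implementation; the names CaseStr and equalChars are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a string that carries its own
// case-sensitivity flag (case sensitive by default, as documented).
struct CaseStr {
    std::string text;
    bool caseSensitive;

    explicit CaseStr(const std::string& t, bool sensitive = true)
        : text(t), caseSensitive(sensitive) {}
};

// Compare two characters, folding case only when the comparison
// is case insensitive.
inline bool equalChars(char x, char y, bool sensitive) {
    if (sensitive) return x == y;
    return std::tolower(static_cast<unsigned char>(x)) ==
           std::tolower(static_cast<unsigned char>(y));
}

// Equality consults the flag of the LEFT operand only, so the
// result may change when the operands are swapped.
inline bool operator==(const CaseStr& lhs, const CaseStr& rhs) {
    if (lhs.text.size() != rhs.text.size()) return false;
    for (std::size_t i = 0; i < lhs.text.size(); ++i)
        if (!equalChars(lhs.text[i], rhs.text[i], lhs.caseSensitive))
            return false;
    return true;
}
```

With a case-insensitive left operand the comparison succeeds; with the same two values swapped it fails, which matches the "1" and "0" outputs shown above.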
4-2 WHATS GONE IN 2.00
----------------------
Member function iindex is not directly supported. Instead use
member function setCaseSensitive(int) to tell a string instance if
its searching and comparing should be case sensitive or not. The
default is case sensitive.
4-3 WHATS CHANGED IN 2.00
-------------------------
Member functions pad and strip now modify their object. Use
global functions pad and strip to just return a value. These
functions were changed for consistency among member functions:
if it makes sense for a member function to modify its object,
then it does. I apologize to those of you who will
have to change your code to reflect this, but your code will be
more readable after you make the modifications. You'll have to
change occurrences such as the following.
Old Way New Way
------------------ ------------------
str a="abc"; str a = "abc";
str b = a.pad(10); str b = pad(a, 10);
5 SEARCHING/REPLACING
---------------------
Member functions index, search, replace, and replaceAll are
provided to find and replace user-defined patterns. There are
various forms of each, which can be summarized as follows.
------------------------------------------------------------------
int index( pattern [,matchLen] [,start] );
Find the next occurrence of pattern in this string.
Returns position where match occurs or -1 if no match is found.
pattern: pattern can be either a (const char *) or a regX.
matchLen: Only allowed if pattern is a regX. matchLen is the
address of an int where the length of the match is saved.
Optional Field.
start: Can be either an (int *) or an (int). If it is an (int),
then it is used to determine where the search is to begin.
If it is an (int *) then it is used to determine where the
search is to begin, and it is updated to the position
where the match is found. Optional Field where the
default is to start at the beginning of the string.
------------------------------------------------------------------
int search( pattern [,matchPtr] [,start] );
Find the next occurrence of pattern in this string.
Returns true if match is found.
pattern: pattern can be either a (const char *) or a regX.
matchPtr: Only allowed if pattern is a regX. matchPtr is the
address of a str where the match is saved.
Optional Field.
start: Can be either an (int *) or an (int). If it is an (int),
then it is used to determine where the search is to begin.
If it is an (int *) then it is used to determine where the
search is to begin, and it is updated to the position
where the match is found. Optional Field where the
default is to start at the beginning of the string.
------------------------------------------------------------------
int replace( pattern, replaceStr [,start] [,numReplace])
Replace occurrences of pattern with replaceStr.
Returns number of actual replacements.
pattern: pattern can be either a (const char *) or a regX.
replaceStr: pattern is replaced by replaceStr
start: Can be either an (int *) or an (int).
If it is an (int), then it is used to determine where
the search is to begin. If it is an (int *) then
it is used to determine where the search is to begin,
and it is updated to the position immediately after the
location where the replacement occurred.
Optional Field where default is to start at the
beginning of the string.
numReplace: Maximum number of replacements to perform.
Optional Field where default is 1.
------------------------------------------------------------------
int replaceAll( pattern, replaceStr [,start])
Replace all occurrences of pattern with replaceStr
Returns number of actual replacements.
pattern: pattern can be either a (const char *) or a regX.
replaceStr: pattern is replaced by replaceStr
start: Can be either an (int *) or an (int).
If it is an (int), then it is used to determine where
the search is to begin. If it is an (int *) then
it is used to determine where the search is to begin,
and it is updated to the position immediately after the
location where the replacement occurred.
Optional Field where default is to start at the
beginning of the string.
------------------------------------------------------------------
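To make the roles of start and numReplace concrete, here is a minimal sketch of a literal-pattern replace written against std::string. It is not the str implementation. The function name replaceLiteral is hypothetical, and setting *start to -1 only when nothing at all was replaced is an assumption based on the description of the (int *) form.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replace up to maxReplace occurrences of a literal pattern, searching
// from offset *start. Afterwards *start is the offset just past the last
// replacement, or -1 if no replacement was made (assumed behaviour).
// Returns the number of replacements actually performed.
inline int replaceLiteral(std::string& text, const std::string& pattern,
                          const std::string& replacement,
                          int* start, int maxReplace = 1) {
    assert(start != nullptr && !pattern.empty());
    int count = 0;
    std::size_t pos = (*start < 0) ? std::string::npos
                                   : static_cast<std::size_t>(*start);
    while (count < maxReplace && pos <= text.size()) {
        std::size_t hit = text.find(pattern, pos);
        if (hit == std::string::npos) break;
        text.replace(hit, pattern.size(), replacement);
        pos = hit + replacement.size();   // resume after the inserted text
        ++count;
    }
    *start = (count > 0) ? static_cast<int>(pos) : -1;
    return count;
}
```

Resuming after the inserted text, rather than at the match position, keeps a replacement that itself contains the pattern from being matched again.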
6 REGULAR EXPRESSIONS
---------------------
Regular expressions are a powerful form of searching and replacing
text. Instead of pattern matching a literal character string,
they can match a more general pattern expression. The regular
expression class obeys the following pattern rules.
6-1 ONE CHARACTER PATTERN RULES:
--------------------------------
1. All characters except ( " * + ? . [ ] & $ @ \" ) represent
themselves.
2. Special characters preceded by a backslash "\", represent the
literal character. However the following characters when
preceded by a backslash have special meaning.
\b backspace
\f formfeed
\n newline
\r carriage return
\t tab
\e escape
\s space
\^ control-character
\xddd character code in hex
\ddd character code in octal
\ literal character code
3. Period represents any character, with the exception of new line.
Use [^] to cross line boundaries.
4. Brackets, [ and ], enclose a set of characters. The set of
characters represents any one of its constituents, or any single
character not in the given sequence if the sequence starts with
^. Within the sequence, - between two characters denotes the
inclusive range. For example, [a-z] represents any lower-case
letter, [^0-9] represents any non-digit character, [aeiou]
represents any vowel.
6-2 MULTI-CHARACTER PATTERN RULES
---------------------------------
1. If * follows a one character pattern, it indicates that the
previous pattern may appear arbitrarily often, or even not at all
(0 or more occurrences).
2. If + follows one of these pattern parts, it indicates that the
previous character pattern appears at least once (1 or more
occurrences).
3. If ? follows one of these pattern parts, it indicates that the
previous character pattern has zero or one occurrence.
6-3 ANCHORS FOR REGULAR EXPRESSION
----------------------------------
1. ^ at the beginning of a pattern represents the beginning of an
input line. $ at the end of a pattern represents the end of
an input line.
2. @ flags what part of the regular expression is actually part of
the match. For example the regular expression "[0-9]+\.@[0-9]+"
will match the fractional part of a floating point number. More
details are described later.
6-4 AMBIGUITY RULES FOR SEARCHING AND REPLACING
-----------------------------------------------
When more than one match is possible, the search chooses the
longest possible match. For instance, consider the following example:
regX pattern("Numbers .*[0-9]+");
str a("Numbers 34,67");
str match;
a.search(pattern, &match); // match = "Numbers 34,67"
Both "Numbers 34,67" and "Numbers 34" match the pattern, but
"Numbers 34,67" is chosen since it is longer.
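The same pattern can be tried with the standard library's std::regex as a rough cross-check. This is only a sketch. std::regex is not regX, and its default grammar uses greedy quantifiers with backtracking rather than a strict longest-match rule, although the two agree on this input. The helper name firstMatch is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Return the text of the first match of pattern in input,
// or an empty string when nothing matches.
inline std::string firstMatch(const std::string& input,
                              const std::string& pattern) {
    std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(input, m, re)) return "";
    return m.str(0);
}
```

Here ".*" first consumes as much as it can and then gives back just enough for "[0-9]+" to match the final digit, so the whole of "Numbers 34,67" is reported.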
6-5 CONTEXT SENSITIVE REGULAR EXPRESSIONS
-----------------------------------------
Quite often one knows the context of what is being looked for, but
does not have any natural way of expressing this in a regular
expression format. Consider the following problem.
Problem:
Write a program that parses a paragraph and removes the comma
from all numbers of the form (ddd,ddd,...). Commas used in
any other context should not be affected.
Solution:
void RemoveNumberComma(str * buffer)
{
regX commaInDigitX("[0-9]@,@[0-9]");
buffer->replaceAll(commaInDigitX, "");
};
str a("Hi, this is test number 3, and the year is 1,992.");
RemoveNumberComma(&a);
The above code replaces the number 1,992 with 1992. The regular
expression commaInDigitX is context aware. The user is not
forced to distinguish for himself whether the comma is in a number
or not. Without context aware regular expressions, the above problem
is more difficult. The programmer would be required to search for
a comma, and then check the next and previous characters to see if
they were digits before performing the replacement.
Here is another example.
regX AuthorX("Please register by sending $15 to @[A-Za-z\\s]+$");
str registration("Please register by sending $15 to Roy S. Woll");
str author;
registration.search(AuthorX, &author); // author = "Roy S. Woll"
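For readers without the registered regX class, the comma-removal problem can be approximated with std::regex. This is a sketch, not the author's method. The '@' marker does not exist in std::regex, so the leading context is kept with a capture group and the trailing context is checked with a lookahead. The name removeNumberCommas is hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Remove a comma only when it sits between two digits.
// "([0-9])" captures the digit before the comma so it can be put back;
// "(?=[0-9])" requires a digit after the comma without consuming it,
// so consecutive groups such as "1,234,567" are all handled.
inline std::string removeNumberCommas(const std::string& text) {
    static const std::regex commaInNumber("([0-9]),(?=[0-9])");
    return std::regex_replace(text, commaInNumber, "$1");
}
```

A comma followed by a space, as in "number 3, and", has no digit after it and is left alone.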
6-6 REGULAR EXPRESSION EXAMPLES:
--------------------------------
{
regX number("[0-9]+");
regX anyWhiteSpace("[\t\\s]+");
regX leadingWhiteSpace("^[\t\\s]+");
//
// replacement
//
str a("This year is 1992");
a.replace( number, "1993"); // a = "This year is 1993"
a = " \tA great string class";
a.replace( leadingWhiteSpace, "");// a = "A great string class"
a = "A great string \t class";
a.replaceAll( anyWhiteSpace, " ");// a = "A great string class"
//
// searching
//
str match;
a = "This year is 1992";
if (a.search(number)) // "found"
cout << "found" << endl;
a.search(number, &match); // match = "1992"
a = "My wife was born on April 12, 1968.";
int pos=0;
a.search(number, &match, &pos); // match = "12";
pos+=match.length();
a.search(number, &match, &pos); // match = "1968";
}
6-7 OPTIMIZATION HINTS FOR REGULAR EXPRESSIONS
----------------------------------------------
Since regular expression compilation (during construction or
assignment) is a relatively slow operation, you should try to
minimize it by declaring regular expressions as static if at all
possible. In this way they will only be compiled once.
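The same advice applies to other regular expression libraries. As an illustration only, using std::regex in place of regX and a hypothetical function name containsNumber, a function-local static is constructed on the first call and reused afterwards:

```cpp
#include <cassert>
#include <regex>
#include <string>

// The pattern is compiled once, the first time the function runs;
// later calls reuse the already constructed object.
inline bool containsNumber(const std::string& line) {
    static const std::regex number("[0-9]+");
    return std::regex_search(line, number);
}
```

Declaring the object inside a loop or a frequently called function without static would recompile the pattern on every pass.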
7 MEMORY ALLOCATION CONTROL
---------------------------
To reduce the overhead of continually creating and destroying str
objects, the str class allows the user to customize how much
memory to allocate initially and when the string buffer is full.
The original buffer is used until an operation causes it to grow
beyond its original size. The new size is the original size plus
the increment size. For example consider the following code
fragment.
{
str example2("abcdefg", 10, 15);
// An instance of str is defined to contain "abcdefg".
// Its initial buffer size is 10 characters, and it grows
// by 15 bytes when it needs to expand.
example2 = "123";
// Assign "123" to example2. No memory reallocation
// is necessary since the new contents still fit in
// the original buffer.
example2 = "0123456789012";
// Assign "0123456789012" to example2. Memory reallocation is
// necessary since the new contents exceed the original size of
// ten characters. This assignment causes a new buffer
// to be created that supports (10+15) bytes.
}
The default value for the initial memory allocation size is the
length of the first value being stored. The default value for the
memory expansion increment is 256 characters. If you'd rather have
the string truncate than expand, use an increment of 0. If
you would like to change the default, then edit the str.h file and
modify the constant str_default_memincr.
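The allocation policy described in this section can be modelled in a few lines. The class below is a hypothetical model (SteppedBuffer is not part of the library) that tracks only the capacity bookkeeping. It assumes that when one increment is not enough, further increments are added until the new contents fit, which the text above does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Model of a buffer whose capacity starts at max(bufsize, initial length)
// and grows in fixed steps of memincr. An increment of 0 disables growth
// and truncates instead.
class SteppedBuffer {
public:
    SteppedBuffer(const std::string& s, std::size_t bufsize, std::size_t memincr)
        : capacity_(bufsize > s.size() ? bufsize : s.size()),
          memincr_(memincr), data_(s) {}

    void assign(const std::string& s) {
        if (s.size() > capacity_) {
            if (memincr_ == 0) {              // growth disabled: truncate
                data_ = s.substr(0, capacity_);
                return;
            }
            // Assumption: keep adding increments until the value fits.
            while (capacity_ < s.size()) capacity_ += memincr_;
        }
        data_ = s;                            // fits: no reallocation modelled
    }

    std::size_t capacity() const { return capacity_; }
    const std::string& value() const { return data_; }

private:
    std::size_t capacity_;
    std::size_t memincr_;
    std::string data_;
};
```

Run against the example2 fragment above, the model keeps a capacity of 10 for "123" and moves to 25 (10+15) for the 13-character value.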
8 INTEROPERABILITY WITH OSTREAM
-------------------------------
Of primary importance in the design of the class was to allow
complete interoperability with class "ostream" from the iostream
library. Class "str" supports all the ostream operations,
including complete usage of the I/O manipulators. Streams already
provide all the functionality for converting built-in and
user-defined types to a stream of char. Rather than trying to
duplicate this, the str class works with the stream class.
For example, the following module takes 3 integers and returns a
time specification as a string in the format "hh:mm:ss".
str getTimeStr(int hour, int minute, int second)
{
str timestr;
timestr.stream() << setfill('0') << setw(2) << hour
<< ":" << setfill('0') << setw(2) << minute
<< ":" << setfill('0') << setw(2) << second;
return timestr;
};
This method has significant advantages over class "ostrstream".
Class "ostrstream" has the disadvantages of not being
interchangeable with "const char *", not having control over
memory allocation/reallocation, and not supporting string
operators and member functions. I also found it significantly
more cumbersome to use.
Class "str" gives the user full access to ostream's capabilities,
while maintaining consistency with the str's buffer.
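For comparison, the same formatting can be written with the standard std::ostringstream, the modern successor to ostrstream. This sketch shows the manipulators at work without the str class. The function name formatTime is hypothetical.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Format a time as "hh:mm:ss". setfill stays in effect once set,
// while setw applies only to the next item written, so it is
// repeated before each number.
inline std::string formatTime(int hour, int minute, int second) {
    assert(hour >= 0 && minute >= 0 && second >= 0);
    std::ostringstream out;
    out << std::setfill('0') << std::setw(2) << hour << ':'
        << std::setw(2) << minute << ':'
        << std::setw(2) << second;
    return out.str();
}
```

Unlike str::stream(), the result here has to be copied out with str() before it can be used as a string.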
9 REFERENCE COUNTING
--------------------
To increase the efficiency for making a copy of a "str" object,
reference counting is used. You may think you rarely make copies
of str objects, but that is probably not the case. Every time you
pass or return an object by value, a temporary copy is made of the
original object. Temporary object creation for string objects can
be an expensive operation.
By using reference counting, temporary object creation becomes a
very cheap operation. Only a pointer is copied to create the new
string, instead of allocating a new pointer, copying the character
buffer, and then deallocating the pointer.
For example:
{
(1) str a = "just some string ";
(2) str b = a;
(3) str c = a + b;
(4) str d = strip(a);
}
Statement (1) creates a str object containing "just some string".
Statement (2) copies a to b. With reference counting only a
pointer is copied. Without reference counting all the character
data is transferred. In statement (3) a temporary str object is
returned by (a+b) and copied to c. Using reference counting only
the pointer for the temporary is copied, instead of the whole
character buffer. Statement (4) is just like statement (3):
strip(a) returns a temporary which is copied to d.
However reference counting itself can present some efficiency
problems. The popular scheme of having a reference pointer, and a
data pointer has the disadvantage of slowing down operations for
singly referenced objects. This is caused by the need to allocate
two unique pointers, and the extra level of indirection when
accessing the character buffer.
The "str" class uses a single pointer that points to a block that
contains both the reference data and the character data. In this
way at most one memory allocation is done per operation, making
the execution times for creating singly referenced objects
comparable to classes that don't use reference counting.
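The single-block layout can be sketched as follows. This is an illustration of the idea, not the actual str internals. The names SharedText and Header are hypothetical, the class is immutable (no copy-on-write), and it is not thread safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

class SharedText {
    struct Header {
        std::size_t refs;     // number of objects sharing this block
        std::size_t length;   // characters stored, excluding the terminator
    };
    Header* block_;           // one pointer: header, then the characters

    char* chars() const { return reinterpret_cast<char*>(block_ + 1); }

    void release() {
        if (--block_->refs == 0) ::operator delete(block_);
    }

public:
    explicit SharedText(const char* s) {
        assert(s != nullptr);
        std::size_t n = std::strlen(s);
        // A single allocation holds both the bookkeeping and the text.
        void* raw = ::operator new(sizeof(Header) + n + 1);
        block_ = new (raw) Header;
        block_->refs = 1;
        block_->length = n;
        std::memcpy(chars(), s, n + 1);
    }

    // Copying shares the block: only the pointer is copied.
    SharedText(const SharedText& other) : block_(other.block_) {
        ++block_->refs;
    }

    SharedText& operator=(const SharedText& other) {
        ++other.block_->refs;   // increment first so self-assignment is safe
        release();
        block_ = other.block_;
        return *this;
    }

    ~SharedText() { release(); }

    const char* c_str() const { return chars(); }
    std::size_t length() const { return block_->length; }
    std::size_t useCount() const { return block_->refs; }
};
```

Because the count lives in front of the characters, creating a singly referenced object costs one allocation, and reaching the text costs one pointer dereference plus a fixed offset.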
STR REFERENCE
constructor str(void);
Construct a str where the memory allocation is determined by the
first assignment to the instance, and then it grows by
str_default_memincr.
constructor str(int bufsize, int memincr= str_default_memincr);
Construct a str that allocates a (bufsize) byte buffer during the
first assignment, and thereafter grows by (memincr) bytes when
buffer is full.
constructor str(const char * s, int bufsize=0, int memincr=str_default_memincr);
constructor str(const str& s, int bufsize=0, int memincr=str_default_memincr);
Construct a str containing (s). It allocates a (bufsize) byte
buffer during the first assignment (or the length of (s) if
larger than bufsize), and thereafter grows by (memincr) bytes when
the buffer is full. Use (memincr=0) to prevent the string from
growing.
Constructor examples:
//
// Memory allocation determined by first assignment
// to instance, and then grows by str_default_memincr.
//
str mystr;
//
// Defines 10 instances of str.
// Memory allocation determined by first assignment
// to instance, and then grows by str_default_memincr.
//
str str_array[10];
//
// Define str containing "Some demo text". 100 bytes of buffer space
// are allocated, and the buffer grows by 200 bytes when it is full.
//
str mystr("Some demo text", 100, 200);
//
// Define (mystr2) to contain substr formed from four characters
// of (mystr1) starting at position 1.
// mystr2 will contain "trin"
//
str mystr1("string 1");
str mystr2(mystr1(1,4)); // "trin"
//
// Define str AnotherStr containing mystr
//
str AnotherStr(mystr);
MEMBER FUNCTION
OPERATORS
const char * operator const char * () const;
Return pointer to this str's character buffer. Compiler will
automatically call this when a cast to a (const char*) is
necessary. Instances of str can be used interchangeably with
const char *.
For example, when using the ANSI C string library:
//
// find location of "234" in MyString
//
// foundptr = "2345 012345"
//
str MyString("012345 012345");
const char * foundptr = strstr(MyString, "234");
//
// However you will not be allowed to use str
// interchangeably with (char *).
//
str MyString("012345");
strcpy(MyString, "hello"); //compiler type-checking error.
() const char * operator()() const;
Return pointer to this str's character buffer.
(int) const char * operator()(int index) const;
Return pointer to this str's character buffer starting at
position (index).
//
// find location of "234" in MyString starting at offset 5
//
// foundptr = "2345"
//
str MyString("012345 012345");
const char * foundptr = strstr(MyString(5), "234");
(int,int) substr operator()(int pos, int num);
Substr operations supported using (int,int) notation. This member
function has two uses. It can be used to extract a substring
from a given string by using it on the right-hand-side of an
expression. It extracts (num) characters starting at offset (pos)
of the string. For example the following code fragment
concatenates two substrings.
{
str myString("abcdefghijklmnopqrstuvwxyz");
cout << myString(0,3) + myString(23,3); // "abcxyz";
}
This member function can also be used to assign to a selected region
of a string when used on the left-hand-side of an expression. It
replaces (num) characters starting at offset (pos) of the string
with the right-hand-side of the expression.
For example, the following code fragment replaces "test" with
"survey" by using a substr operation.
{
str test_substr1("This is a test");
int pos = test_substr1.index("test");
if (pos>=0) test_substr1(pos, 4) = "survey";
}
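One common way to let the same (pos, num) notation work on both sides of an assignment is to return a small proxy object. The sketch below shows that technique over std::string. It is an assumption about how such an operator can be built, not a description of the library's substr class. The names Text and Slice are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Proxy for a region of a string. It remembers the owner, the
// starting offset and the number of characters selected.
class Slice {
    std::string& owner_;
    std::size_t pos_, num_;
public:
    Slice(std::string& owner, std::size_t pos, std::size_t num)
        : owner_(owner), pos_(pos), num_(num) {}

    // Right-hand-side use: read the selected characters.
    operator std::string() const { return owner_.substr(pos_, num_); }

    // Left-hand-side use: overwrite the region, resizing the owner
    // when the replacement has a different length.
    Slice& operator=(const std::string& replacement) {
        owner_.replace(pos_, num_, replacement);
        num_ = replacement.size();
        return *this;
    }
};

class Text {
    std::string data_;
public:
    explicit Text(const std::string& s) : data_(s) {}

    Slice operator()(std::size_t pos, std::size_t num) {
        assert(pos <= data_.size());
        return Slice(data_, pos, num);
    }

    const std::string& value() const { return data_; }
};
```

With this in place, t(10, 4) = "survey" rewrites the region in the owning string, while std::string head = t(0, 4) copies the region out.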
[] operator char& operator[](int position);
Return reference to character buffer at offset position.
Example []:
//
// Can access str as an array of char for assignment purposes.
//
str MyString("012345");
MyString[4]='0'; // MyString = "012305"
[] operator char operator[](int position) const;
Return character at offset position of character buffer. This
version is only called if the calling str object is a constant
object. It is provided because it is significantly faster than
the non-const version of operator [].
Example []:
//
// Can access str as an array of char for retrieval purposes.
//
const str MyString("012345");
cout << MyString[4]; // writes '4' to screen
= str & operator = (const str & s); // s = str;
str & operator = (const substr & s); // s = substr;
str & operator = (const char * s); // s = charptr
str & operator = (const char s); // s = character
Assign (s) to this str.
Example:
{
str s,t;
t = 'a'; // t = "a"
t = "abc"; // t = "abc"
s = t; // s = "abc"
s = t(1,2); // s = "bc"
};
+= str & operator += (const str & s); // s += str
str & operator += (const char * s); // s += charptr
str & operator += (const char s); // s += char
Concatenate (s) to this str
Example:
{
str s;
str t="123";
s += 'a'; // s = "a"
s += "abc"; // s = "aabc"
s += t; // s = "aabc123"
s += t(1,2); // s = "aabc12323"
};
<< str & operator << (const char * s); // s << charptr
str & operator << (const str& s); // s << str
str & operator << (const int s); // s << int
str & operator << (const char s); // s << char
Concatenate (s) to this str. Operator "<<" is only provided
because it has a more natural associativity (left to right) than
operator "+=" when concatenating a series.
Examples:
{
str s;
str t="123";
int i=99;
s << 'a'; // s = "a"
s << "abc"; // s = "aabc"
s << t; // s = "aabc123"
s << i; // s = "aabc12399"
s << t(1,2); // s = "aabc1239923"
};
{
str testconcat1(" there in ");
int year = 1992;
str test;
test << "hello" << testconcat1
<< year << '.'; // test = hello there in 1992.
}
+ str operator+(const str &) const;
str operator+(const substr &) const;
str operator+(const char * b) const;
str operator+(const char b) const;
friend str operator+(const char *, const str &);
Concatenate s1 and s2 and return the result.
{
str a("123");
str s = a + "abc"; // s = "123abc"
}
Notes on efficiency: For optimal performance, avoid cascading
operator "+" since temporary objects will be created. Use
operators "+=", "<<" , or the stream() member function. Operator
"+=" and "<<" are more efficient than stream(), since stream()
must create an instance of dynstream the first time it is used,
but they are not as flexible or convenient.
FRIEND/GLOBAL
FUNCTION OPERATORS
==,!=,<= int operator==(const str & s1, const str & s2)
>=,<,> Equal to, not equal to, less than, etc. relational operators
supported. Comparison of s1 and s2 is determined by the case
sensitivity of s1.
Example for relational operators:
//
// The following code fragment will yield an output of
// "010101", signifying false, true, false, true, false, true.
//
str a("abc");
str b("def");
cout << (a == b);
cout << (a != b);
cout << (a >= b);
cout << (a <= b);
cout << (a > b);
cout << (a < b);
MEMBER
FUNCTIONS
assign str & assign (const char * source, int len)
Assign (len) characters from (source) to this str.
Example:
str a;
str b="0123456789";
a.assign(b(5), 5); // a = "56789";
caseSensitive int caseSensitive(void) const;
Returns the case sensitivity for the current str object.
index int index(const char * s, int start=0) const;
Search this str for character string (s) and return the offset
where a match occurs. Search starts at offset (start). Return
-1 if no match is found. Case sensitivity is determined by this
str instance. Case is sensitive by default when the str is
created, but can be overridden through member function
setCaseSensitive(int).
index int index(const regX& reg, int start=0) const;
Search this str for regular expression (reg) and return the offset
where a match occurs. Search starts at offset (start). Return
-1 if no match is found.
//
// Search for the first floating point number
//
// pos = 20
//
{
regX floatingPointNumber ("[0-9]+[.][0-9]+");
str a = "pi is approximately 3.14159265";
int pos = a.index(floatingPointNumber);
}
index int index(const regX& reg, int * matchLen, int start=0) const;
Same as index(const regX&, int) but returns the length of the
matched string in *matchLen.
//
// Search for the first floating point number
//
// pos = 20
// matchlen = 10
//
{
regX floatingPointNumber ("[0-9]+[.][0-9]+");
str a = "pi is approximately 3.14159265";
int matchlen;
int pos = a.index(floatingPointNumber, &matchlen);
}
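The position and length reported above can be cross-checked with std::regex. This is a sketch using the standard library rather than regX. The function name indexOf and its parameter order are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Return the offset of the first match at or after start, or -1 when
// there is none. If matchLen is non-null it receives the match length.
inline int indexOf(const std::string& text, const std::regex& re,
                   int* matchLen = nullptr, int start = 0) {
    assert(start >= 0);
    if (static_cast<std::size_t>(start) > text.size()) return -1;
    std::smatch m;
    if (!std::regex_search(text.begin() + start, text.end(), m, re))
        return -1;
    if (matchLen) *matchLen = static_cast<int>(m.length(0));
    // position() is relative to the iterator the search began at.
    return start + static_cast<int>(m.position(0));
}
```

For "pi is approximately 3.14159265" the match begins at offset 20 and is 10 characters long, in agreement with the values in the comments above.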
insert int insert(int pos, char ch);
Insert char (ch) starting at offset (pos) of this str. Insertion
can fail if the str is already full, and the str is not allowed to
expand (constructed with memincr=0).
{
a = "abcdefghi";
a.insert(5, '1'); // a = "abcde1fghi"
}
insert int insert(int pos, const char * insertStr);
Insert character buffer insertStr starting at offset (pos) of this
str. Insertion can fail if the insertStr would cause this str to
overflow, and the str is not allowed to expand (constructed with
memincr=0).
{
a = "abcdefghi";
a.insert(5, "12"); // a = "abcde12fghi"
}
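The failure case mentioned for both insert functions (a full str that is
not allowed to expand) can be pictured with a small bounded buffer. This
is a sketch only: BoundedText, its fields, and the 1/0 return values are
assumptions made for illustration, since the return value of insert is
not described here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical bounded buffer; not the str implementation.
struct BoundedText {
    std::string data;      // current contents
    std::size_t capacity;  // maximum length when growth is disabled
    bool canGrow;          // false models a str built with memincr=0

    // Assumed convention: return 1 on success, 0 if the insert is refused.
    int insert(std::size_t pos, const std::string& piece)
    {
        if (pos > data.size())
            return 0;
        if (!canGrow && data.size() + piece.size() > capacity)
            return 0;  // would overflow a buffer that cannot expand
        data.insert(pos, piece);
        return 1;
    }
};
```

A buffer holding "abcdefghi" with capacity 10 and growth disabled accepts
a one-character insert but refuses a two-character one, leaving the
contents unchanged.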
length int length(void) const;
Return current string length of buffer.
{
str MyString("abc");
cout << MyString.length(); // writes 3 to screen.
}
pad str& pad(int padsize, int t=right, char padchar = ' ');
Pad this str with (padchar) characters on the right and/or left,
yielding a string of length (padsize). The original string is
modified. The pad type (t) can be one of (right, left, or both).
//
// This code performs the following.
//
// a = "hello there "
// a = "*********************hello there"
// a = " hello there "
//
//
str a("hello there");
a.pad(32);
a = "hello there";
a.pad(32, str::left,'*');
a = "hello there";
a.pad(32, str::both);
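The three pad types can be sketched on std::string as follows. The names
padTo, padRight, padLeft and padBoth are hypothetical. Two behaviours
are assumptions, since they are not specified here: a string already
longer than padsize is returned unchanged, and when padding both sides
with an odd number of fill characters the extra one goes on the right.

```cpp
#include <cstddef>
#include <string>

enum PadType { padRight, padLeft, padBoth };  // hypothetical names

// Hypothetical stand-in for pad(); returns the padded copy.
std::string padTo(const std::string& s, std::size_t padsize,
                  PadType t = padRight, char padchar = ' ')
{
    if (s.size() >= padsize)
        return s;  // assumption: never truncate
    const std::size_t fill = padsize - s.size();
    if (t == padRight)
        return s + std::string(fill, padchar);
    if (t == padLeft)
        return std::string(fill, padchar) + s;
    // padBoth: split the fill, extra character on the right (assumption).
    const std::size_t left = fill / 2;
    return std::string(left, padchar) + s + std::string(fill - left, padchar);
}
```

Padding "hello there" (11 characters) to 32 on the left with '*' gives 21
asterisks followed by the text, which agrees with the example above.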
remove void remove(int pos, int numdel=1);
Remove (numdel) characters starting at position (pos) of this str.
{
a = "abcdefghi";
a.remove(5, 2); // a = "abcdehi"
}
replace int replace(const regX& reg, const char * replaceStr, int
start=0, int numReplacements=1);
Replace occurrences of the pattern (reg) with (replaceStr).
Replacement begins at offset (start) of this string, and at most
(numReplacements) replacements are performed. The number of
actual replacements is returned.
//
// Replace first occurrence of whitespace with a single blank
//
// a = "A great string class"
//
regX whiteSpace("[\t ]+");
str a = "A \t great string class";
a.replace(whiteSpace," ");
replace int replace(const regX& reg, const char * replaceStr, int*
startPtr, int numReplacements=1);
Replace occurrences of the regular expression (reg) with
(replaceStr). Replacement begins at offset (*startPtr) of this
string, and at most (numReplacements) replacements are performed.
The number of actual replacements is returned. (*startPtr) is
updated to begin after the matched pattern or set to -1 if no
match found.
//
// Replace all whitespace with a single blank.
//
// a = "A great string class"
//
regX whiteSpace("[\t ]+");
str a = "A great string \t class";
int pos=0;
while (a.replace(whiteSpace, " ", &pos));
replace int replace(const char * pattern, const char * replaceStr,
int start=0, int numReplacements=1);
Replace occurrences of the character string (pattern) with
(replaceStr). Replacement begins at offset (start) of this
string, and at most (numReplacements) replacements are performed.
The number of actual replacements is returned.
//
// Replace 2 occurrences of "/" with " " starting at pos 3
//
// a = "A/great string class"
//
str a = "A/great/string/class";
a.replace("/", " ", 3, 2);
replace int replace(const char * pattern, const char * replaceStr,
int * startPtr, int numReplacements=1);
Replace occurrences of the character string (pattern) with
(replaceStr). Replacement begins at offset (*startPtr) of this
string, and at most (numReplacements) replacements are performed.
The number of actual replacements is returned. (*startPtr) is
updated to begin after the matched pattern or set to -1 if no
match found.
//
// Replace all occurrences of "/" with " "
//
// a = "A great string class"
//
a = "A/great/string/class";
int pos=0;
while (a.replace("/", " ", &pos));
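The cursor convention used by the (startPtr) overloads can be sketched
on std::string. This is an illustration, not the library code:
replaceNext is a hypothetical name that performs a single replacement,
and placing the cursor just past the inserted text is an assumption
about what "after the matched pattern" means.

```cpp
#include <cstddef>
#include <string>

// Hypothetical single-step replace with a moving cursor.
// Returns 1 if a replacement was made, 0 otherwise.
// *startPtr moves past the inserted text, or becomes -1 on no match.
int replaceNext(std::string& text, const std::string& pattern,
                const std::string& replacement, int* startPtr)
{
    if (pattern.empty() || *startPtr < 0 ||
        static_cast<std::size_t>(*startPtr) > text.size()) {
        *startPtr = -1;
        return 0;
    }
    const std::size_t at =
        text.find(pattern, static_cast<std::size_t>(*startPtr));
    if (at == std::string::npos) {
        *startPtr = -1;
        return 0;
    }
    text.replace(at, pattern.size(), replacement);
    *startPtr = static_cast<int>(at + replacement.size());
    return 1;
}
```

Because the cursor skips the inserted text, a loop of the form
while (replaceNext(...)) terminates even when the replacement itself
contains the pattern.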
replaceAll int replaceAll(const char * pattern, const char * replaceStr,
int pos=0);
Replace all occurrences of the character string (pattern) with
(replaceStr). Replacement begins at offset (pos) of this
string. The number of actual replacements is returned.
//
// Replace all occurrences of "!" with " "
//
// a = "A great string class"
//
str a = "A!great!string!class";
a.replaceAll("!", " ");
replaceAll int replaceAll(const regX& reg, const char * replaceStr,
int pos=0);
Replace all occurrences of the regular expression (reg) with
(replaceStr). Replacement begins at offset (pos) of this
string. The number of actual replacements is returned.
//
// Replace all whitespace with a single blank
//
regX whiteSpace("[\t ]+");
str a = "A great string \t class";
a.replaceAll(whiteSpace," "); // a = "A great string class"
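A comparable replace-all with a replacement count can be sketched with
std::regex. The name regexReplaceAll is hypothetical and the sketch
always starts at offset 0. Note that std::regex_replace gives '$' a
special meaning in the replacement text, which may differ from regX.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Hypothetical stand-in for str::replaceAll(const regX&, const char*).
// Replaces every match in text and returns the number of matches.
int regexReplaceAll(std::string& text, const std::regex& re,
                    const std::string& replacement)
{
    // Count the matches first; regex_replace does not report a count.
    const int count = static_cast<int>(std::distance(
        std::sregex_iterator(text.cbegin(), text.cend(), re),
        std::sregex_iterator()));
    text = std::regex_replace(text, re, replacement);
    return count;
}
```

For a string with three runs of blanks and tabs, the sketch returns 3 and
leaves single blanks between the words.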
search int search(const char * pattern, int *startPtr) const;
Search for the character string (pattern) and return 1 if
(pattern) is found. Searching begins at offset (*startPtr) of
this string. (*startPtr) is updated to the starting position of
the match or set to -1 if no match is found.
{
str a = "I love snow. God is love.";
str love = "love";
int pos = 0;
if (a.search(love, &pos)) // pos = 2
cout << "there is love"; // "there is love"
pos+= love.length();
if (a.search(love, &pos)) // pos = 20
cout << "there is more love"; // "there is more love"
}
search int search(const char * pattern, int start=0) const;
Search for the character string (pattern) and return 1 if
(pattern) is found. Searching begins at offset (start) of this
string.
{
str a = "I love snow";
if (a.search("love"))
cout << "there is love"; // "there is love"
}
search int search(const regX& reg, int *startPtr) const;
Search for the regular expression reg in this string and return 1
if (reg) is found. Searching begins at offset (*startPtr) of this
string. (*startPtr) is updated to the starting position of the
match, or -1 if no match is found.
{
regX number("[0-9]+");
str a = "John 3:16";
int pos=0;
a.search(number, &pos); // pos = 5;
}
search int search(const regX& reg, int start=0) const;
Search for the regular expression reg in this string and return 1
if (reg) is found. Searching begins at offset (start) of this
string.
{
regX number("[0-9]+");
a = "John 3:16";
if (a.search(number))
cout << "found number"; // "found number"
}
search int search(const regX& reg, str * matchPtr=0, int start=0) const;
Search for the regular expression reg in this string and return 1
if (reg) is found. Searching begins at offset (start) of this
string. The matched pattern is saved as (*matchPtr).
{
regX number("[0-9]+");
str a = "Bobby Fisher: world champion in 1993?";
str match;
a.search(number, &match); // match = "1993"
}
search int search(const regX& reg, str* matchPtr, int* startPtr)
const;
Search for the regular expression reg in this string and return 1
if (reg) is found. Searching begins at offset (*startPtr) of this
string. (*startPtr) is updated to the starting position of the
match, or -1 if no match is found. The matched pattern is saved
as (*matchPtr).
{
regX number("[0-9]+");
str month, day, year;
str a = "My wife's birthday is 4/12/1968.";
int pos=0;
a.search(number, &month, &pos); // month = "4";
pos+= month.length();
a.search(number, &day, &pos); // day = "12";
pos+= day.length();
a.search(number, &year, &pos); // year = "1968";
}
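The step-through pattern in the birthday example (search, then advance
the cursor by the length of the match) can be sketched with std::regex.
This is only an approximation: searchNext is a hypothetical name and
std::string stands in for str.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical stand-in for str::search(const regX&, str*, int*).
// Returns 1 on a match and 0 otherwise. *startPtr becomes the offset
// of the match, or -1 when nothing is found.
int searchNext(const std::string& text, const std::regex& re,
               std::string* match, int* startPtr)
{
    if (*startPtr < 0 ||
        static_cast<std::size_t>(*startPtr) > text.size()) {
        *startPtr = -1;
        return 0;
    }
    std::smatch m;
    if (!std::regex_search(text.cbegin() + *startPtr, text.cend(), m, re)) {
        *startPtr = -1;
        return 0;
    }
    *startPtr += static_cast<int>(m.position(0));
    if (match)
        *match = m.str(0);
    return 1;
}
```

The caller advances the cursor by the length of each match before the
next call, exactly as the example above does with pos += month.length().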
size int size(void) const;
Return current size of memory allocated for buffer.
{
str MyString("abc", 80);
cout << MyString.size(); // writes 80 to screen.
}
setCaseSensitive void setCaseSensitive(int val);
Set the case sensitivity for the current str object. If (val) is
0 then string comparisons and searches will not be case sensitive.
If (val) is 1 then they will be case sensitive.
stream ostream& stream(void);
Return ostream for this str. Consult your iostream documentation
for details on using an ostream.
Examples stream():
//
// getTimeStr() converts a time specification into a string of
// the format "hh:mm:ss" using leading zeros.
//
str getTimeStr(int hour, int minute, int second)
{
str timestr;
timestr.stream() << setfill('0') << setw(2) << hour
<< ":" << setfill('0') << setw(2) << minute
<< ":" << setfill('0') << setw(2) << second;
return timestr;
}
//
// Use stream operation to concatenate two strings.
//
str a;
a.stream() << "Hello there."; // a = "Hello there."
a.stream() << " Goodbye."; // a = "Hello there. Goodbye."
stream ostream& stream(int p);
Return ostream for this str and move the ostream put pointer to
offset (p). Same functionality as stream(void) except that the
user can change the stream put pointer.
Example stream(int):
str a;
a.stream() << "Hello there."; // a = "Hello there."
a.stream(0) << "Hello again."; // a = "Hello again."
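The two stream examples have a close analogue in std::ostringstream,
sketched below. The function names timeString and overwriteFromStart
are hypothetical, and seekp(0) is used here as a rough counterpart to
stream(0); the str class may manage its put pointer differently.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Format a time as "hh:mm:ss" with leading zeros.
std::string timeString(int hour, int minute, int second)
{
    std::ostringstream out;
    // setfill persists; setw applies only to the next item written.
    out << std::setfill('0') << std::setw(2) << hour << ':'
        << std::setw(2) << minute << ':'
        << std::setw(2) << second;
    return out.str();
}

// Write text, move the put pointer back to offset 0, and overwrite it.
std::string overwriteFromStart()
{
    std::ostringstream out;
    out << "Hello there.";
    out.seekp(0);
    out << "Hello again.";
    return out.str();
}
```

Both strings in the second function have the same length, so the
overwrite leaves no trailing remnant of the first.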
strip str& strip(int striptype=trailing, const char * stripchars=" \t");
Strip leading and/or trailing characters from the str and return
the resulting str. The original string is modified. The strip
type (striptype) can be one of (leading, trailing, or both).
(stripchars) is a character string that contains a set of
characters to strip. The default is to strip trailing spaces and
tabs.
Example strip:
str origstr("********hello there ");
str a = origstr;
a.strip(); // a = "********hello there"
a = origstr;
a.strip(str::leading,"*"); // a = "hello there "
a = origstr;
a.strip(str::both," *"); // a = "hello there"
strip str& strip(int striptype=trailing, char stripchar);
Strip leading and/or trailing characters (as defined by stripchar)
from the str and return the resulting str. The original string is
modified. The strip type (striptype) can be one of (leading,
trailing, or both).
//
// This code performs the following.
//
// a = "********hello there"
// a = "hello there"
//
str a("********hello there ");
a.strip(str::trailing,' ');
a.strip(str::leading,'*');
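Stripping by character set can be sketched on std::string with
find_first_not_of and find_last_not_of. The names stripChars,
stripLeading, stripTrailing and stripBoth are hypothetical, and the
sketch returns a copy instead of modifying the original.

```cpp
#include <cstddef>
#include <string>

enum StripType { stripLeading, stripTrailing, stripBoth };  // hypothetical

// Hypothetical stand-in for strip(); removes any of the characters in
// chars from the chosen end(s) of s and returns the result.
std::string stripChars(std::string s, StripType type = stripTrailing,
                       const std::string& chars = " \t")
{
    if (type == stripLeading || type == stripBoth) {
        // npos here means every character is strippable: erase all.
        s.erase(0, s.find_first_not_of(chars));
    }
    if (type == stripTrailing || type == stripBoth) {
        const std::size_t last = s.find_last_not_of(chars);
        s.erase(last == std::string::npos ? 0 : last + 1);
    }
    return s;
}
```

A single-character strip is the same call with a one-character set, for
example stripChars(s, stripLeading, "*").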
FRIEND/GLOBAL FUNCTIONS
>> friend istream& operator >> (istream&, str &);
Overload istream input operator to support str objects.
str a;
cin >> a; // read string from keyboard and store in a
<< friend ostream& operator << (ostream&, const str &);
Overload ostream output operator to support str objects.
str a("hello there");
cout << a << endl; // write string to screen
lowercase str lowercase(const char *);
Return lower case of character buffer
{
str a("Abc");
cout << lowercase(a); // outputs "abc"
}
pad str pad(const char * s, int padsize,
int t=str::right, char padchar = ' ');
Pad (padchar) characters to the right and/or left of s, yielding a
string of length (padsize). The pad type (t) can be one of (right,
left, or both).
//
// This code performs the following.
//
// a = "hello there "
// a = "*********************hello there"
// a = " hello there "
//
str origstr("hello there");
str a = pad(origstr, 32);
a = pad(origstr, 32, str::left,'*');
a = pad(origstr, 32, str::both);
strip str strip(const char * s, int striptype=str::trailing,
const char * stripchars=" \t");
Strip leading and/or trailing characters from (s) and return the
resulting str. The strip type (striptype) can be one of (leading,
trailing, or both). (stripchars) is a character string that
contains a set of characters to strip. The default is to strip
trailing spaces and tabs.
Example strip:
//
// This code performs the following.
//
// a = "********hello there"
// a = "hello there "
// a = "hello there"
//
str origstr("********hello there ");
str a = strip(origstr);
a = strip(origstr, str::leading,"*");
a = strip(origstr, str::both," *");
strip str strip(const char * s, int striptype=trailing, char stripchar);
Strip leading and/or trailing characters (as defined by stripchar)
from s and return the resulting str. The strip type (striptype)
can be one of (leading, trailing, or both).
//
// This code performs the following.
//
// a = "********hello there"
// a = "hello there "
//
str origstr("********hello there ");
str a = strip(origstr, str::trailing,' ');
a = strip(origstr, str::leading,'*');
uppercase str uppercase(const char *);
Return upper case of character buffer
{
str a("Abc");
cout << uppercase(a); // outputs "ABC"
}
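Case conversion that returns a new string, as lowercase and uppercase
do, can be sketched with std::transform. The names toUpperCopy and
toLowerCopy are hypothetical, and the conversion follows the current C
locale, which may not match the str class for non-ASCII characters.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Return an upper case copy of s; the argument is left untouched.
std::string toUpperCopy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return s;
}

// Return a lower case copy of s; the argument is left untouched.
std::string toLowerCopy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}
```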
PROTECTED STATIC MEMBER FUNCTIONS
setDefaultCaseSensitive
void setDefaultCaseSensitive(int val);
Set the default case sensitivity to val. The default is used when
creating new instances of str.
regX CLASS REFERENCE
constructor regX(void);
Default constructor for a regular expression. The user should assign
a regular expression using operator = before using it.
constructor regX(const char * regexp);
Create a regular expression defined by the pattern (regexp).
The regular expression is automatically compiled.
constructor regX(const regX& regexp);
Copy constructor for regular expressions.
= regX& operator=(const char * regexp);
Use the regular expression defined by regexp. The regular
expression is automatically compiled.
= regX& operator=(const regX& regexp);
Use the regular expression defined by regexp.
error int error(void) const;
Return 1 if there was an error in compiling the regular
expression.
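Checking for a compile error before using a pattern can be sketched
with std::regex, which reports a bad pattern by throwing
std::regex_error instead of setting a flag. The name compileFailed is
hypothetical; it mirrors the 1-on-error convention of error().

```cpp
#include <regex>
#include <string>

// Hypothetical helper: return 1 if pattern fails to compile, else 0.
int compileFailed(const std::string& pattern)
{
    try {
        std::regex re(pattern);  // compilation happens in the constructor
        (void)re;                // the compiled object is not needed here
        return 0;
    } catch (const std::regex_error&) {
        return 1;
    }
}
```

A pattern with an unmatched parenthesis such as "(abc" is rejected, while
"[0-9]+" compiles cleanly.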
index int index(const char * searchStr, int * matchLenPtr,
int start=0, int caseSensitive=1);
Search for this regular expression in the character buffer
searchStr starting at position start. If the match is found save
the length of the match in *matchLenPtr. Case sensitivity is
determined by (caseSensitive). Return the position where the match
is found or -1 if no match is found.